The gene encoding a β-N-acetylglucosaminidase from
Streptococcus pneumoniae has been obtained by screening
an expression library for β-N-acetylglucosaminidase
activity. Clones of different nucleotide sizes, each having
arylglycoside activity, were obtained, and DNA sequencing
revealed a gene of 3933 base pairs possessing typical
bacterial transcription initiation and termination sequences
and terminating in an ochre stop codon. Computer analysis
of the translated protein of 1311 amino acids (144,210 Da)
identified a tandem repeat within which lies a sequence
homologous with six other hexosaminidase gene products
from a wide variety of species ranging from bacteria to
humans. Also found were an amino-terminal putative
secretion signal peptide and a carboxyl-terminal cell
sorting/anchorage motif typically found in over 20 other
Gram-positive surface proteins. The expression of an almost
complete DNA clone in Escherichia coli produced a functional
and authentic β-N-acetylglucosaminidase with aglycon
specificity identical to the wild-type enzyme. However,
enzymes produced from truncated DNA clones show more
restricted aglycon specificity and are unable to hydrolyze
terminal β1-2GlcNAc residues from N-glycans containing a
bisecting N-acetylglucosamine. The availability of these
clones allows structural analyses to be made of catalytic
and oligosaccharide recognition protein domains that
enhance functional activity.
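
The gene length, protein length, and predicted mass quoted above can be cross-checked with simple arithmetic. The following is a minimal Python sketch written for illustration only; the constant and variable names are ours and are not part of the original computer analysis.

```python
# Consistency check on the figures quoted in the summary (illustrative only).
ORF_LENGTH_BP = 3933        # open reading frame length reported in the text
PROTEIN_MASS_DA = 144_210   # predicted mass of the translated protein

# Every codon is three bases, so the ORF length should divide evenly by 3.
codons, leftover_bases = divmod(ORF_LENGTH_BP, 3)

# Average mass contributed by each residue of the predicted protein.
mean_residue_mass = PROTEIN_MASS_DA / codons

print(f"codons: {codons}, leftover bases: {leftover_bases}")
print(f"mean residue mass: {mean_residue_mass:.1f} Da")
```

A mean residue mass near 110 Da is the usual rule-of-thumb value for proteins, so the three reported numbers are mutually consistent.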



β-N-Acetylglucosaminidase, found both in the culture medium
and in association with undisrupted cells, is one of six
extracellular Streptococcus pneumoniae glycosidases purified
to date (1-3). This repertoire of enzymes, in addition to the
β-N-acetylglucosaminidase, includes a β-galactosidase, three
endoglycosidases, and a neuraminidase that are thought to aid
the organism in the breakdown of oligosaccharides in its
surrounding environment for use as a carbon source. In
particular, the neuraminidase has been implicated in the
process of pathogenesis. By cleaving the terminal sialic acids
on cell-surface glycolipids, it is thought that the
neuraminidase can expose the carbohydrate ligand by which
S. pneumoniae attaches to the host (4, 5). S. pneumoniae also
synthesizes and secretes a hyaluronidase that is capable of
degrading a component of the extracellular matrix and may aid
the organism to invade underlying host tissue (6). Since
GlcNAcβ1-linked residues are common components of several
cell-surface molecules in host tissues, it is possible that
β-N-acetylglucosaminidase also has a pathogenic role.

All of these glycosidases are important reagents for
oligosaccharide analysis. β-N-Acetylglucosaminidase has been
demonstrated to have broad specificity using GlcNAc-Gal,
hydrolyzing both β1-3 and β1-6 linkages that are commonly
found in mucins. By contrast, the hydrolysis of N-linked sugars
at low enzyme concentrations is restricted to GlcNAcβ1-2Man
linkages except when the Manα1-6 arm is substituted with a
GlcNAc at both C-2 and C-6 positions, or the Man β-linked to
the chitobiose core is substituted at the C-4 position by a
bisecting GlcNAc (7). This limited activity can be a useful tool
in the sequencing of oligosaccharides, giving specific structural
information about the cleaved bond and the surrounding sugar
residues. It also raises the issue of defining the intrinsic
properties of the enzyme that govern this phenomenon. The aim of
this study was to use an expression cloning strategy to obtain
the gene for β-N-acetylglucosaminidase and to investigate the
interaction between the enzyme and its N-linked carbohydrate
substrates at the molecular level. By the expression of
truncated portions of the β-N-acetylglucosaminidase gene, we
have been able to examine defined regions of the protein that
determine glycosidase specificity.

EXPERIMENTAL PROCEDURES 

Materials—Oligonucleotides were synthesized by British
Biotechnology (Oxford, United Kingdom). Restriction and
DNA-modifying enzymes were purchased from Boehringer Mannheim
and New England Biolabs. Radionucleotides were obtained from
Amersham Corp. and ICN-Flow (High Wycombe, Bucks, UK). Prime-It
random primer DNA labeling kit was purchased from Stratagene.

Bacterial Strains—XL1-Blue Escherichia coli strain (recA− (recA1,
lac−, endA1, gyrA96, thi, hsdR17, supE44, relA1, {F′ proAB, lacIq,
lacZΔM15, Tn10})) was purchased from Stratagene. NM554 E. coli
strain (recA13, araD139, Δ(ara-leu)7696, Δ(lac)17A, galU, galK, hsdR,
rpsL(strr), mcrA, mcrB) was obtained from Stratagene.

Purification of β-N-Acetylglucosaminidase—A homogeneous
preparation of the cellular form of β-N-acetylglucosaminidase was
obtained from S. pneumoniae (ATCC 12213) using previously published
methods (3). A limited trypsin digest of β-N-acetylglucosaminidase
followed by amino-terminal amino acid sequencing was used to obtain
two tryptic peptide sequences, a major sequence, EGADIPIIGGMVA, and
a minor sequence, LQPMAFND. Samples for SDS-polyacrylamide gel
electrophoresis were prepared by methods previously described (3).

Construction and Screening of Genomic Expression Library—
Genomic DNA was prepared from frozen S. pneumoniae cells (1 ml of
packed cells) following a method used for tissue samples (8). Extracted
genomic DNA was partially digested with AluI and separated by
agarose gel electrophoresis. Fragments ranging between 1 and 7 kb¹ were
gently extracted from the agarose by adsorption to glass beads using the
Geneclean II Kit (Bio 101, Inc., Stratech Scientific, Luton, UK) to avoid
shearing. The ends of the size-selected pieces were then blunt-ended
using Klenow and T4 DNA polymerase (8). EcoRI linkers were ligated
to blunt-ended genomic DNA fragments with T4 DNA ligase, and
these inserts were ligated into EcoRI-digested λ-ZapII vector arms
(Stratagene) and packaged with Gigapack® II packaging extracts
(Stratagene) to generate a λ bacteriophage expression library. The
number of primary recombinants was determined by plating the library
on the XL1-Blue E. coli host strain. The plated expression library was
screened for β-N-acetylglucosaminidase activity by incorporation of 50
μM 4-methylumbelliferyl N-acetylglucosaminide in the top agarose and
visualization of hydrolysis by UV light. Positive clones were
plaque-purified by successive plating, and pure plaques were subcloned
into plasmid Bluescript SK− by the in vivo excision protocol that
accompanies the vector (Stratagene).
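
How many clones a genomic library needs in order to contain a given locus can be estimated with the Clarke-Carbon formula, N = ln(1 − P) / ln(1 − f), where f is the insert size as a fraction of the genome. The Python sketch below is illustrative only. The genome size (about 2.2 Mb) and the mean insert size (4 kb, the midpoint of the 1-7 kb size selection) are our assumptions, not values from this work, and the estimate ignores the extra requirement of an expression library that the insert be in the correct orientation and reading frame.

```python
import math

def clones_needed(genome_bp, insert_bp, probability):
    """Clarke-Carbon estimate of the number of clones required so that a
    given locus is present at least once with the requested probability."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))

def coverage_probability(genome_bp, insert_bp, n_clones):
    """Probability that a given locus is present in at least one clone."""
    fraction = insert_bp / genome_bp
    return 1.0 - (1.0 - fraction) ** n_clones

GENOME_BP = 2_200_000   # ASSUMPTION: approximate S. pneumoniae genome size
INSERT_BP = 4_000       # ASSUMPTION: midpoint of the 1-7 kb size selection

n99 = clones_needed(GENOME_BP, INSERT_BP, 0.99)
p_screened = coverage_probability(GENOME_BP, INSERT_BP, 300_000)

print(f"clones for 99% coverage of one locus: {n99}")
print(f"coverage probability for 300,000 clones: {p_screened:.6f}")
```

Under these assumptions a few thousand clones already give good coverage of any single locus, so a screen of several hundred thousand plaques leaves a wide margin for inserts that are out of frame or in the wrong orientation.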

Rescreening of Library for Overlapping Clones—The library was
plated and screened by plaque hybridization (Colony/Plaque Screen
nylon discs, DuPont NEN) using DNA probes that were
[α-³²P]dATP-labeled using the Prime-It random primer labeling kit
(Stratagene) and purified over Select-D®, G-25 spin columns
(5 Prime → 3 Prime, Inc., CP Labs, Bishop's Stortford, Herts, UK).
DNA probes were prepared from the 5' end of pBStrH7 by restriction
digesting with BglII and EcoRI and from the 3' end of pBStrH17 by
restriction digesting with NcoI and EcoRI to generate probes of 570
and 730 bp, respectively.

Plasmid DNA Purification—Clones were amplified and purified
using the maxi-prep pZ523® spin column plasmid purification kit
(5 Prime → 3 Prime, Inc.). DNA was also purified using Magic Mini
Preps (Promega).

Southern Blot Analysis of Clones—Southern transfer of DNA to nylon
membranes (Hybond, Amersham Corp.) was made according to
published methods (8, 9). Oligonucleotide probes designed from the
amino acid sequence of the major tryptic peptide from the purified
protein, GAA GC(T/A) GC(T/A) GAT AT(T/C) CC(A/T) AT(T/C) AT(T/C)
GG(T/A) GG(T/A) ATG GT, were labeled with [γ-³²P]dATP using
polynucleotide kinase (8). After hybridization at 56 °C overnight,
filters were washed and subjected to radioautography.
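
The degenerate probe above is a pool of related sequences, one for each combination of the alternative bases. The Python sketch below parses the probe notation, reports the probe length and its degeneracy, and enumerates the pool. It is illustrative only: the probe string is our reading of the printed sequence with the damaged parentheses restored, and the function names are hypothetical.

```python
import re
from itertools import product

# ASSUMPTION: probe as read from the text, with "(X/Y)" marking a mixed position.
PROBE = ("GAA GC(T/A) GC(T/A) GAT AT(T/C) CC(A/T) AT(T/C) AT(T/C) "
         "GG(T/A) GG(T/A) ATG GT")

def parse_probe(text):
    """Return one list of allowed bases per nucleotide position."""
    compact = text.replace(" ", "")
    positions = []
    # Each match is either a single fixed base or a parenthesised choice.
    for fixed, choice in re.findall(r"([ACGT])|\(([ACGT/]+)\)", compact):
        positions.append([fixed] if fixed else choice.split("/"))
    return positions

def degeneracy(positions):
    """Number of distinct sequences in the degenerate pool."""
    total = 1
    for bases in positions:
        total *= len(bases)
    return total

positions = parse_probe(PROBE)
probe_length = len(positions)
fold = degeneracy(positions)
pool = ["".join(bases) for bases in product(*positions)]

print(f"probe length: {probe_length} nt, degeneracy: {fold}-fold")
print("first member of the pool:", pool[0])
```

Keeping the degeneracy low (two alternatives at each mixed position) limits the number of mismatched species in the pool, which helps when hybridizing at a single fixed temperature.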

DNA Sequencing—Nucleotide sequences were determined by the
dideoxy chain termination method (10) using the Sequenase DNA
sequencing kit (U.S. Biochemical Corp., Cambridge Bioscience,
Cambridge, UK). Both strands of DNA were sequenced in their entirety
by a combination of specific oligonucleotide primers and the
transposon-facilitated DNA sequencing TN1000 nested set kit (Gold
Biotechnology, Inc., St. Louis, MO). Sequence analysis was accomplished
using MacVector and Assembly-Lign software (IBI, Cambridge, UK) and
on-line NCBI data bases.

Northern Blot Analysis—Total RNA was prepared from S. pneumoniae
using guanidinium thiocyanate (11). RNA (10 μg) was separated on a
denaturing formaldehyde agarose gel with RNA molecular sizing
fragments ranging from 0.24 to 9.5 kb and Northern blotted. The
nylon filter was probed with a 670-bp ³²P-labeled BglII-EcoRI fragment
used to rescreen the genomic library.

Subcloning into the pGEX Expression Vector—The vector pGEX-3X
(Pharmacia, Milton Keynes, UK) was digested with EcoRI and
dephosphorylated. The inserts from clones pBStrH7, -8, and -17 were
prepared by digestion with EcoRI and gel purification. The insert and
vector were ligated overnight and transformed into NM554 E. coli by
electroporation. Colonies were ampicillin-selected on LB/amp plates,
and 20 were randomly chosen for analysis. Mini-prepped DNA was
digested with EcoRI and separated on an agarose gel for verification of
vector and insert sizes.

pGEX Expression and Affinity Purification—The expression and
purification were carried out according to published methods (12). 50-ml
cultures of E. coli transformed with pGEX clones were induced by the
addition of isopropyl-1-thio-β-D-galactopyranoside (1 mM), and the cells
were sonicated. Affinity beads were mixed with bacterial cell lysates for
2 h at 4 °C and washed 3 times with buffer, and the fusion protein was
eluted from the affinity beads by the addition of fresh 6 mM reduced
glutathione. Where appropriate, factor Xa cleavage was conducted by
incubating factor Xa (Denzyme, Aarhus, Denmark) at 1 mg/ml with
beads coupled to fusion protein at room temperature overnight.
Supernatants were assayed for enzyme activity after centrifuging the
beads.

¹ The abbreviations used are: kb, kilobase pair(s); StrH, S. pneumoniae
hexosaminidase; 4-MU-GlcNAc, 4-methylumbelliferyl
N-acetylglucosaminide; bp, base pair(s); HPAEC, high performance
anion-exchange chromatography.

Substrate Specificity Assays—Tritium-labeled biantennary,
triantennary, tetraantennary, bisected-biantennary, and bisected-hybrid
oligosaccharide alditols were obtained from Oxford GlycoSystems
(Abingdon, UK). (GlcNAcβ1,4GlcNAc)3 was obtained from a partial
chitin hydrolysis, and a degalactosylated biantennary native
oligosaccharide was purified from human asialotransferrin (13).

Native oligosaccharide substrate was incubated at different
concentrations (0.2-1.0 mM) with 2.5 milliunits/ml of wild-type or
recombinant β-N-acetylglucosaminidase in 60 mM citric acid/sodium
phosphate buffer, pH 5.0, containing 1 mg/ml bovine serum albumin at
37 °C for 1 h. The reactants were desalted, and hydrolysis was monitored
by Dionex HPAEC (Dionex BioLC system) using a CarboPac PA-1 column
eluted at 1 ml/min with 150 mM NaOH, 30 mM NaOAc, and the reaction
products were detected using triple-pulsed amperometric detection
with the following pulse potentials and durations: E1 = 0.01 V (t1 = 120
ms), E2 = 0.6 V (t2 = 120 ms), and E3 = −0.93 V (t3 = 130 ms). The
extent of hydrolysis was calculated from empirically derived response
factors for substrate and reaction products, and the data were plotted
using a weighted nonlinear regression analysis (Multifit 2.0, Day
Computing, Cambridge, UK). Radiolabeled oligosaccharide alditols were
separated using an isocratic eluant of 200 mM NaOH and the fractions
taken for radioactivity determination by scintillation counting. Bio-Gel
P-4 chromatography (Oxford GlycoSystems) was also used to separate
the reaction products after enzyme digestion, and the radioactivity in
each fraction was determined as above.
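
Weighted nonlinear regression of rate against substrate concentration can be illustrated with the Michaelis-Menten model v = Vmax·S/(Km + S). The pure-Python sketch below is not the Multifit 2.0 routine, whose algorithm is not described here. It is a stand-in that uses the fact that, for a fixed Km, the weighted least-squares Vmax has a closed form, so only Km needs a one-dimensional search. The data are synthetic, generated from assumed parameters, and the 5% proportional error used for the weights is likewise an assumption.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten rate v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)

def profile(km, s_vals, v_vals, weights):
    """For a fixed Km the model is linear in Vmax; return the weighted
    least-squares Vmax and the weighted sum of squared residuals."""
    f = [s / (km + s) for s in s_vals]
    num = sum(w * v * fi for w, v, fi in zip(weights, v_vals, f))
    den = sum(w * fi * fi for w, fi in zip(weights, f))
    vmax = num / den
    sse = sum(w * (v - vmax * fi) ** 2 for w, v, fi in zip(weights, v_vals, f))
    return vmax, sse

def fit_mm(s_vals, v_vals, weights=None, km_lo=1.0, km_hi=1.0e4, n_grid=200):
    """Return (Vmax, Km) minimising the weighted sum of squares."""
    if weights is None:
        weights = [1.0] * len(s_vals)

    def sse(km):
        return profile(km, s_vals, v_vals, weights)[1]

    # Coarse log-spaced scan to bracket the minimum.
    grid = [km_lo * (km_hi / km_lo) ** (i / (n_grid - 1)) for i in range(n_grid)]
    best = min(range(n_grid), key=lambda i: sse(grid[i]))
    a, b = grid[max(best - 1, 0)], grid[min(best + 1, n_grid - 1)]

    # Golden-section refinement inside the bracket.
    phi = (5 ** 0.5 - 1) / 2
    for _ in range(100):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    km = 0.5 * (a + b)
    return profile(km, s_vals, v_vals, weights)[0], km

# Synthetic, noise-free demonstration over the 0.2-1.0 mM range used above.
substrate_um = [200.0, 400.0, 600.0, 800.0, 1000.0]
rates = [mm_rate(s, 17.0, 132.0) for s in substrate_um]   # assumed parameters
weights = [1.0 / (0.05 * v) ** 2 for v in rates]          # assumed 5% error
vmax_fit, km_fit = fit_mm(substrate_um, rates, weights)
print(f"Vmax = {vmax_fit:.2f}, Km = {km_fit:.1f} uM")
```

With real measurements the weights would come from the replicate variance at each concentration, and the standard errors of Km and Vmax would need a separate estimate, for example from the curvature of the residual surface.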

RESULTS 

Expression Cloning of StrH—An expression cloning strategy
was adapted from published procedures (14) to obtain the gene
encoding S. pneumoniae β-N-acetylglucosaminidase. A S. pneumoniae
expression library of 5 × 10⁵ primary recombinants,
generated in the vector λ-ZapII (Stratagene), was amplified in
the E. coli host strain XL1-Blue, and 300,000 clones were
screened for β-N-acetylglucosaminidase activity using the
substrate 4-methylumbelliferyl N-acetylglucosaminide
(4-MU-GlcNAc). 32 positive clones were identified by fluorescent halos
encircling each plaque when visualized under 366-nm ultraviolet
light. Hydrolysis was confirmed to be specific for
N-acetylglucosaminide by the inclusion of a control substrate,
4-methylumbelliferyl xyloside, in the screening protocol. No
hydrolysis of this substrate was seen. From the group of positive
clones, 20 were selected at random and subcloned into pBluescript
plasmids via in vivo excision according to the manufacturer's
procedures and analyzed by restriction digestion to determine the
sizes of the genomic DNA inserts. Restriction-digested DNA of
the four largest inserts, ranging in size from 1584 to 2504 bp,
was Southern blotted and probed with a degenerate ³²P-labeled
oligonucleotide designed from the amino acid sequence of
the major tryptic peptide (Fig. 1). Hybridization of the
radiolabeled probe to each clone selected by the enzyme activity of
its translated product confirmed that they encoded the same
protein that had been biochemically purified from the S. pneumoniae
cells (Fig. 1). These clones (Fig. 2a) were sequenced and were found
to comprise a 2.7-kb continuous open reading frame with no start or
stop codons, indicating that further screening of the genomic
library was required to locate the missing gene sequences.
³²P-Labeled DNA probes from the 5' and 3' ends of this 2.7-kb
partial gene were used to identify two clones that contained the
5' and 3' ends of the strH gene (Fig. 2b). Complete sequence
analysis of these three contiguous clones revealed a single,
continuous open reading frame of 3933 bp designated strH (str,
streptococcus; H, hexosaminidase) (Fig. 3). The ATG (methionine)
start codon, designated +1, was identified by the positions of
three consensus sequences: the AGGAGG Shine-Dalgarno ribosome
binding sequence located 7 bases upstream from the ATG translation
initiation codon, and the two putative promoter sequences, TTGACT
resembling a "−35" transcription initiation sequence beginning at
−87 and TATAAT, a "−10" transcription initiation sequence beginning
at −43. The ochre stop codon located at base 3934 was
Fig. 3. strH DNA and amino acid sequence and flanking regions.
EcoRI and AluI sites flank both ends of the genomic sequence. Both
tryptic peptide sequences are marked in boldface. Transcription
initiation and termination sequences are double underlined. The
Shine-Dalgarno sequence is underlined. The start and stop codons are
marked by a dotted underline, and the A of the ATG start codon is
designated +1. The two 30-amino acid tandem repeating regions
homologous with other hexosaminidases (Table II) are underlined.
This sequence has been deposited in the GenBank data base (accession
number L36923).



CGCTCTAaaACTJU5TraATCCCCC^ 
AAAAATTGGGTCTACGCGGCGATCAAT^ 

ACGGTGTCGTTTATATCTACCATACATCCTACATCCCAGAACAATATATCAACGCAAATT 
ATCCAAACCTTGAAT ACTATAGTTCT ATCTATAATCGTTTC AACTTACACTAC C AC ATTC 
ACATOAATGATGAACACTTTOAAGAAATCAACGAAATAGTC 
CG^CTTCAGTCCTCGGAGTCGATOAACAA 
AACTCGAATCAACTCGCCAAGTTTTAGAATACAGTG 

AC AAGATTAAATTCJ^TCTC ATCTG ACCCTGATCACTAAG AAG AAAG C CTGAGCCTAATCG 
CTCGOTCTTTTTCTMCCTrATTATACC^ 
AAGATTC TATAATTT GTGGAATATTTrATAGAA^^ 
TATACGTATTTTITCCGACCAATCTGAAATCTT^ 
TTATTATAGTTTTTTGTC ATCTCC TCTTG ACTCTCGTTA^ 
CGTTGATTTATATAATGGCTATGAAT^^ 

M K H 

ATGAAAAACAACAGCGTTTTTCTATTCGTAAATA^ . 

EKQQRPSIRKYAVGAASVLI 
TTGGATTTGCCTTCCAAGCACAGACTCTTCCACCCGATGGAG^^ 

GFAFQAQTVAADGVTTTTBN 
ACCAACCGACCATCCATACAGTTTCTCATTCCCCTCAATCATCCG 

QPTIHTVSDSPQSSENRTBE 
AAACACCTAAAGCAGAGCTTCAACCAGAAGCTCCAAAAACIGTAGAAACA^ 

TPKAELQPBAPKTVETETPA 
CTACTGATAAC43TAGCTAGTCTTCCAAAAAC^GAAGA 

TDKVASLPKTEEKPQBBVSS 
CAACTCCTAGTG ATAAAGC AG AGGTGGTAACTCC AACTTCTG CTG AAAAAGAAACIX3CT A 
TPSDKAEVVTPTS AEKBTAN 
ATAAAAAGGCAGAAGAAGCTAGCCCTAAAAAGGAAGAAGCGAAAGAGGTTGATTCTAAAG 
KKABBAS PKKEEAXEVDSKE 
AGTCAAATACAGACAAGACTGACAAGGATAAACCAGCTAAAAAAGATGAAGCGAAAGCAG 
SNTDKTDKDKPAKKDEAKAE 
AGGCTGACAAACCGGAAACAC^G^CAGGAAAGGAACGTGCTGCAACTCTAAA 

ADKPETEAGKERAATVNEKt, 
TAGC3AAAAAGAAAATTCTTTCTATTGATGCTGGACGTAAATATTTC 

AKKKIVS IDAGR KYFS PEQL 
TCAAGGAAATC^TCGATAAAGCGAAACATTATGGCTACACTOATTTACACCTATTAGTCO 
KEII DKAKHYGYTDLH LLVG 
GAAATGATGGACTCCGTTTCATGTTGGACGATATGAGCATCACAGCTAACGGCAAGACCT 
NDGLRPMLDDMS ITANGKTY 
ATGCCAGTGACGATGTCAAACGCGCCATTGAAAAAGGTACAAATGATTATTACAACGATC 
ASDDVKRAIEKGTNDYYNDP 
CAAACGGCAATC AC^TAACAGAAAGTCAAATGAC AGATCTG ATTAACTATGCCAAAG ATA 
N G N H L TESOMTDLINY AKDR 
AAGGTATCGGTCTCATTCCGACAGTAAATAGTCCTGGACACATGGATGCGAT^ 

Q I g I* I PTVNfiPflHMn A I L N A 

CC ATG AAAG AATTGGG AATC C AAAACCCTAAC TTTAGCTATTTTGGG AAGAAATC AGCC C 
MKELG IQNPNPSYFGKKSAR 
GTACTGTCGATCTTaACAACGAACAAGCTGTCGCTTTTACA 

TVDLDNEQAVAFTXALIDKY 
ATGCrTGCTTATTTCGCGAAAAAGACTGAAATCTTCAACATCC^ 

AAYFAKKTEI FNIGLDEYAN 
ATGATGCGACAGATGCTAAAGGTTGGAGTGTGCTTCAAGCTGATAAATACTATC^ 

DATDAKGWSVLQADKYYPNE 
AAGGCTACCCTGTAAAAGGCTATGAAAAATTTATTGCCTACGCCAATC 

GYPVRGYEKFIAYANDLARI 
TTGTTAAAATCGCACGGTCTCAAACCAATGGCTTTTAACGACGGTA 

V K S H G -_ L _ K PMAFMDG1YYNSD 
AGACAAGCTTTGGTAGTTTTGACAAAGACATCATCGTTTCTATGTC 

tsfgsfdkdi ivsmwtg gwg 
gaggctacgatgtcgcttcttctaaactactagctgaaaaaggtcacc 

g y d v a 5 skllaekghq ilnt 
ccaatgatgcttggtactacgttcitgaacgaaacg 

ndawyyvlgrnadgqgwynl 
tcgatcaggcgctcaatgctattaaaaacacaccaatcacttctc 

dqglng ikntp itsvpktbo 
gagctcatatcccaatcatcggtggtatggtagctgcttgggc^ 

a.dipiiogmva;awadtpsar 
gttattcaccatcacgcctcttcaaajctcatgcgtcattttc^^ 

yspsrlfklmrhfananaey 
acttcgcag<ngattatgaatctgcagagcaagcacttaacx5aggt 

paadyesabqalnbvpkdln 
accgttatactgcagaaagcgtcacggccgtaaaagaagctgaaaaagctattcgctc 

RYTAESVTAVKBAEKAI R S L 
TCGATAGCAACCTTAGCCGTGCCCAACAAGATACGATTGATCAAGC 

DSNLSRAQ.QDTIDO.AIAK.LO. 
AAGAAACTGTC AAC AACTTG AC CCTC ACGCCTG AAGCTCTAAAAG AAGAAG AAGCTAAAC 
ETVNNLTLTP EALXEEEAKR 
GTGAGGTTGAAAAACTTGCCAAAAACAAGGTAATXrTCAATCGATG^ 

EVEKLAKNKVI S IDAGRKYF 
TTACTCTGAACC AGCTC AAACCGATCGTAGAC AAGGCCAGTC AGCTC^ 

TLNQLKRIVDKASBLGYSDV 
TCCATGTCCTTCTAGGAAATGACCGACTTCCCTTTCTACTCGA 

HLLLGNDGLRFLLDDMT ITA 
CCAACGGAAAAACCTATGCTAGTGATGACGTTAAAAAAGCTATTATCGAAGGAACTAAAG 
tTGKTYASDDVKKAI IEGTKA 



dues (19). Another striking feature found within the amino acid
sequence of the β-N-acetylglucosaminidase is a tandem repeat
of approximately 335 amino acids from site 180 to 522 and
again between sites 625 and 956. This repeat is apparent in
both nucleotide sequence and amino acid sequence self-alignments,
although greater homology is conserved when compar-
2108 CTTACTACGACGLATCCAAACGGTACTGCACTAACACAGGCAGAACTAACAGAGCTAATTC

YYDDPNGTAL TOAEVTBLIE 
2168 AATACGCTAAATCTAAGOACATCGGTCTCArCCCAGCTATTAACAGTCCAGGTCACATGG 

YAKSKDICLIPAINSPQHMn 
2228 ATWTATGCTGGTPGCCATaSAAAAATTAGGTATTAAAAA 

A MLVAMEKLGIKNPQAHFDK 
2288 AAGTTTCAAAAACAACTATGGACTTGAAAAACGAAGAAGCGATGA 

VSKTTMDLKNEEAMNPVK. AL 
2348 TCATCCCTAAATACATGGACTTCTITGC 

I GKYMDPF AGKTK I FNFGTD 
2408 ACGAATACGCCAACGATGCGACTAGTGCCCAAGGCT<X3TACTACCTCAAGTGGTATCAAC 

EYANDATSAQGWYYLKWYQL 
2468 TCTATGGCAAATTTGCCGAATATG<rCAACACCCTCOCAGCTATGGCCAAAGAAAGAGGGC 

YGKFAEYANTLAAMAKERGL 
2528 TTCAACCAATGGCCTTCAACGATC^TTCTACTATCAAGACAAGGACGATO 

QPHAPtfDG FY Y E DK D DVQ FD 
2588 ACAAAGATGTCTTGATTTCTTACTC^TCTAAAGGCTGGTGGGGATATAACCTC 

KDVLISYWSKGWWGYNLASP 
2648 CTCAATACCTAGCAAGCAAAGGCTATAAATTCTTGAATACCAACGGTGACrcGTACTACA 

QYLASKGYKPLNTNODWYY I 
2708 TTCTTGGTCAAAAACCAGAAGATCXJTGGTGGTTrcCTCAA^ 

LGQKPBDGGGFLKKAI E N T G 
2768 GAAAAAC ACC ATTC AATC AACTAGCTTCTACC AAATATC CTC AAGT AG ATCTTCC AAC AG 

K TPFNQLASTXYPBVDLPTV 
2828 TCCGAAGTATCCTTTCAATCTGCGCAGATAGACCAAGCCCTGAOTACAAGGAACAGGAAA 

GSMLS IWADRPSAEYKEEE I 
2888 TCTTTGAACTCATOACTCCCTTIXXAGACCACAACAAAGACTACTTCCGTGCTA^ 

FELMT AFADHNKDYFRANYN 
2948 ATGCTCTCCGCGAAGAATTAGCTAAAATTCCTACAAACTTAGAAGGATATAGTAAAGAAA 

ALREELAKI P T N L £ G Y S K E S 
3008 GTCTTGAGGCCCTTGACGCAGCTAAAACAGCTCTAAATTACAACCTCAACCGT 

LEALDAAKTALNYNLNRNKQ 
3068 AAGCTGAGCTTGACACACTTGTAGCCAACCTAAAAGCCGCTCTTCAA^ 

AELDTLVANLKAALQGLKPA 
3128 CTGCAACTCATTCAGGAAGCCTAGATGAAAATCAAGTGGCTCCCAATGTTC 

ATHSGSLDENEVAANVETRP 
3188 CAGAACTCATCACAAGAACTGAAGAAATTCCATTTCAAGTTATCAAGAAAGAAAATCCT^ 

ELITRTEEIPFBVIKKENPN 
3248 ACCTCCCAGCTGGTCAGGAAAATATTATCACAGCAGGACTCAAAGGTGAACGAACTCATT 

L.PAG Q B N I I T A G V K GE RTH Y 
3308 ACATCTCTGTACTCACTGAAAATGGAAAAACAACAGAAACAGTCCTTGATAGCCAGGTAA 

I SVLTBNGKT TETVLDSQVT 
3368 CCAAAGAAGTTATAAACCAAGTGGTTGAAGTTGGCGCTC 

XEVINQVVEVGAPVTHKGDE 
3428 AAACTGGTCrTGCACCAACTACTCAGCTAAAACCTAGA^ 

SGLAPTTEVKPRLDIQKEEI 
3488 TTCCATTTACCACAGTGACTCGTCAAAATCCACTCTTACTCAAAGGA^ 

PFTTVTRENPLLLK GKTQVI 
3548 TTACTAAGGGCGTCAATGGACATCGTAGCAACTTCTACTCTGTG^ 

TKGVNGHRSNFYSVSTSADG 
3608 GTAAGG AAGTGAAAAC ACTTGTAAATAGTGTCGTAGC ACAGGAAGCCGTT AC TCAAATAG 

KEVKTLVNSVVAQEAVTQ IV 
3668 TCGAAGTCGGAACTATCXSTMCACATCTAGGCGATGAAAACCXJACAAGCCGCTATTGC^ 

EVGTNVTHVGDENGQAAIAE 
3728 AAGAAAAACCAAAACTAGAAATCCCAAGCCAACCAGCTCCATGAAC^^ 

E K P K L E I PSQPAPSTAPAEE 
3788 AAAGCAAAGCTCTTCCTCAAGA1CCAGCTCCTCTGG 

SKALPQDPAPVVTEKKLPET 
3848 CAGGAACTCACGATTCTGCAGGACTAGTAGTCGCAGGACTCATGTCCACACTAGCAGCCT 

GTHDSAGLVVAGLMSTL AAY 
3908 ATX^ACTCACTAAAAGAAAAGAACACyAAGTCTTTTCGATAAAAAATAAACAGCGAGATT 

GLTKRKED* 
3968 GAA6CTCX5CTCTTTATrTTTTAA TTAATCACCTA 
4028 ACTCGTTTGGTXyTAATAAACTGGGTTGAAGAT'ITCATC^ 
4088 GATGTTACTTC^IXJAATCTGCCTCAAGAAGTGGTTTAAAGTCTACT^ 
4148 TAGGCTCTTWKX3TTGCACCAAGTCATAGGCTTGCTCACGGGTC^ 
4208 AATG TCAAC ATAGCC CGTTGGCTAAAG ATAAG AC CAAAAGTCGAGTTC ATGTTTCGG ATC 
42 68 ATATTTTCTGGGAAGACTGTCAAGTTCITCAC^ 
4328 TCAATCAAAATGGTCGTATCTCGTGTCATCAT^^ 

Fig. 3 — continued 



ing the amino acid sequences than is found between the nucle- 
otide sequences in these repeat regions. Within each of these 
tandem repeat regions lies a sequence spanning 30 amino acids 
which, when compared to each other, are 67% identical and 
93% similar when considering conservative amino acid substi- 
tutions. Interestingly, both of these 30 amino acid sequences 
have considerable homology to protein sequences found in six 
other hexosaminidases isolated from a wide variety of species 
including bacteria and humans (Table II). 
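
The 67% identity quoted for the two 30-amino acid regions can be reproduced with a simple ungapped, position-by-position comparison. The Python sketch below is illustrative only: the two sequences are our reading of the repeat regions from the Fig. 3 translation, which is damaged in places, and should be checked against the deposited sequence (GenBank L36923) before being relied upon. Similarity is not computed because the conservative-substitution scheme used in the analysis is not stated.

```python
# ASSUMPTION: repeat regions as read from the Fig. 3 translation.
REPEAT_1 = "TESQMTDLINYAKDKGIGLIPTVNSPGHMD"   # first tandem repeat
REPEAT_2 = "TQAEVTELIEYAKSKDIGLIPAINSPGHMD"   # second tandem repeat

def percent_identity(a, b):
    """Ungapped identity: count positions carrying the same residue."""
    if len(a) != len(b):
        raise ValueError("ungapped comparison needs equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return matches, 100.0 * matches / len(a)

matches, identity = percent_identity(REPEAT_1, REPEAT_2)
print(f"{matches}/{len(REPEAT_1)} identical positions ({identity:.0f}%)")
```

Most of the differences fall in the first half of the region, while the carboxyl-terminal stretch ending in SPGHMD is nearly invariant between the two repeats.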

Northern Blot Analysis—A total RNA preparation from S. pneumoniae
was used to identify the transcript of β-N-acetylglucosaminidase.
A major band hybridizing at 4.0 kb (results not shown) was
consistent with the size of the 4.0-kb DNA obtained from sequencing.

Kinetic Analysis of Recombinant and Wild-type
β-N-Acetylglucosaminidase—Clones pBStrH8 and -17, as expected,
showed arylglycoside hydrolytic activity, but to determine that
these had the same activity as wild-type β-N-acetylglucosaminidase
using a native biantennary N-glycan, they were
subcloned into the pGEX-3X expression vector (Pharmacia Biotech
Inc.) for the high-level production of recombinant enzyme.
The Km and Vmax values were determined for the two recombinant
enzymes from clones pGXStrH17, the longest clone, and
pGXStrH8, the most truncated clone. Wild-type enzyme demonstrated
the highest affinity for the natural oligosaccharide
with a Km of 132 μM compared with the Km values of the
Fig. 4. a, hydrophilicity plot of the amino terminus of strH. The strH translation product was analyzed by a Kyte-Doolittle hydrophilicity plot.
The first 50 amino acids of the amino terminus were analyzed for features of a signal peptide motif found in other prokaryotes. The consensus
motifs are labeled. b, hydrophilicity plot of the carboxyl terminus of strH. The last 35 amino acids of the strH carboxyl terminus were scanned for
features of a cytoplasmic membrane/cell wall anchorage motif common in over 20 other Gram-positive bacterial surface proteins. The features of
the sorting signal are labeled.
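
A Kyte-Doolittle profile of the kind described in this legend is a sliding-window average of per-residue hydropathy values, in which positive scores mark hydrophobic stretches such as the core of a signal peptide. The Python sketch below is illustrative only. The scale values are the published Kyte-Doolittle indices, but the amino-terminal sequence is our reading of the damaged Fig. 3 translation, and the 9-residue window is an assumption, since the window used for Fig. 4 is not stated.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy(seq, window=9):
    """Mean Kyte-Doolittle score of every full window along the sequence."""
    scores = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        scores.append(sum(KD[aa] for aa in segment) / window)
    return scores

# ASSUMPTION: amino terminus as read from the Fig. 3 translation.
N_TERMINUS = "MKHEKQQRFSIRKYAVGAASVLIGFAFQAQTVAADG"

kd_profile = hydropathy(N_TERMINUS)
peak = max(range(len(kd_profile)), key=kd_profile.__getitem__)

print(f"first window score: {kd_profile[0]:.2f}")
print(f"most hydrophobic window starts at residue {peak + 1}: "
      f"{N_TERMINUS[peak:peak + 9]} ({kd_profile[peak]:.2f})")
```

In this reading the charged stretch at the extreme amino terminus scores negative and is followed by a strongly hydrophobic segment, the pattern expected of a secretion signal peptide.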

Table I 

Aligned consensus sequences at the amino and carboxy termini of proteins from Streptococcus and Staphylococcus species 
β-GlcNAc'ase is the abbreviation for β-N-acetylglucosaminidase. Residues in other Streptococcal and Staphylococcal proteins identical to
β-N-acetylglucosaminidase are in boldface and conservative amino acid substitutions are underlined.



Source (Ref.)         Protein                        N terminus        C terminus

S. pneumoniae         β-GlcNAc'ase                   K Q Q R F S I
S. agalactiae (20)    α antigen                      K * ft R F S I
S. agalactiae (21)    β antigen                      K • M R Y S I
S. pyogenes (22)      M-related protein precursor    K • K N Y S L
S. pyogenes (23)      IgA receptor precursor         K ft • • ¥ S L
S. suis (24)          Surface protein                K ft ft * W S I
S. aureus (25)        Fibronectin binding protein    N N L R Y C X
S. aureus (26, 27)    Protein A                      K K H I Y S I
S. epidermidis (28)   Lipase                         R ft N K Y S I



RKTAVGAA 
KRFKFGAA 
R X F_S V O V A 
RKLKTOIA 
RKLKTGTA 
RKYHFGAA 
RRHKLGAA 
RKLGVGIA 



S V 

S V 

S V 

S V 

S V 

S V 

S V 

S V 



I G 

I G 

V_A 

V A 

L G 

L G 

U G 



R R F__S VGASSIXiZA 



L P 6 T 

L P A r 

L P Y V 

I» P S T 

I» P S T 

L P H T 

L P B T 

L P B V 

L P F T 



6 T 

G E 

G V 

G E 

G B 

G E 

G G 

G E 

I * 



smaller length enzymes from clones pGXStrH17 and pGXStrH8,
which were 2-2.5 times greater (Table III). The rates of
hydrolysis, however, followed an opposite trend, where the
Vmax measured for the smallest enzyme clone, pGXStrH8, was
4 times faster than the wild-type enzyme and 3 times faster
than the largest recombinant enzyme from clone pGXStrH17.
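
The fold differences described above follow directly from the kinetic constants. The Python sketch below recomputes them, together with the Vmax/Km ratio as a rough index of catalytic efficiency. It is illustrative only: the numbers are the fusion-protein and wild-type values as we read them from Table III, and the dictionary layout and function names are ours.

```python
# Kinetic constants as read from Table III (Km in uM; Vmax in the table's units).
KINETICS = {
    "wild-type": {"km": 132.0, "vmax": 17.0},
    "pGXStrH17": {"km": 302.0, "vmax": 24.0},
    "pGXStrH8":  {"km": 364.0, "vmax": 75.0},
}

def fold_change(enzyme, reference, key):
    """Ratio of one kinetic constant between two enzymes."""
    return KINETICS[enzyme][key] / KINETICS[reference][key]

def efficiency(enzyme):
    """Vmax/Km, a rough index of efficiency at low substrate concentration."""
    return KINETICS[enzyme]["vmax"] / KINETICS[enzyme]["km"]

km_17_vs_wt = fold_change("pGXStrH17", "wild-type", "km")
km_8_vs_wt = fold_change("pGXStrH8", "wild-type", "km")
vmax_8_vs_wt = fold_change("pGXStrH8", "wild-type", "vmax")
vmax_8_vs_17 = fold_change("pGXStrH8", "pGXStrH17", "vmax")

print(f"Km, pGXStrH17 vs wild-type:  {km_17_vs_wt:.2f}-fold")
print(f"Km, pGXStrH8 vs wild-type:   {km_8_vs_wt:.2f}-fold")
print(f"Vmax, pGXStrH8 vs wild-type: {vmax_8_vs_wt:.2f}-fold")
print(f"Vmax, pGXStrH8 vs pGXStrH17: {vmax_8_vs_17:.2f}-fold")
for name in KINETICS:
    print(f"Vmax/Km, {name}: {efficiency(name):.3f}")
```

Because the truncated enzyme combines a higher Km with a higher Vmax, the Vmax/Km ratio is a convenient single number for comparing the constructs at substrate concentrations well below Km.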

The Km and Vmax were also determined for the factor Xa-cleaved
enzymes from clones pGXStrH8 and pGXStrH17 (Table III). The
Km of the fusion protein, from clone pGXStrH17, and the
Xa-cleaved enzyme, StrH17, were very similar. By contrast, the
Km values of the clone pGXStrH8 and StrH8 recombinant
enzymes differed by a factor of two. However, the standard
error of the Km of the StrH8 enzyme was quite large (Table III).

Aglycon Specificity of Recombinant β-N-Acetylglucosaminidase—
Affinity-purified fusion proteins were assayed for their
ability to hydrolyze a panel of radiolabeled oligosaccharide
alditol substrates to compare aglycon specificity with the
wild-type purified β-N-acetylglucosaminidase (Fig. 5). The reaction
products from enzyme digestions were separated by HPAEC.
Because this technique shows a shift in retention times
between the substrate and hydrolysis product indicating cleavage
has occurred, the structures of hydrolysis products were veri- 
Table II

Hexosaminidase consensus sequence comparison between six different species
Boldface residues indicate perfect matching with the S. pneumoniae β-N-acetylglucosaminidase. Underlined residues indicate conservative
amino acid substitutions relative to the S. pneumoniae β-N-acetylglucosaminidase.



Organism (Ref.)           Protein                        Hexosaminidase consensus sequence

S. pneumoniae             β-N-acetylglucosaminidase
S. pneumoniae             StrHS'ase
Vibrio harveyi (29)       N,N′-diacetylchitobiase
Vibrio vulnificus (30)    β-N-acetylhexosaminidase
Homo sapiens (31)         β-hexosaminidase α-subunit
M. musculus (32)          β-N-acetylhexosaminidase
D. discoideum (33)        β-N-acetylhexosaminidase
Candida albicans (34)     β-N-acetylhexosaminidase



T Q A B V 

* E S Q M 

S K A D Y 

T R E D Y 

T A Q D V 

T A Q D V 

S H D D_JT 



T B 

X D 

V B 

K E 

R B 

K B 

Q E 



I* I 

L I 

V I 

V I 

V V 



B r A K S K V 

N T A K D K G 

K Y A K A R W 

A T A S A_R_N 

B t A R L R G 

8 ? A R L R G 

A y A K T Y G 



QLIrAIWS 
G L I P T_V H S 
BVIPEIDM 
Q V I P S M P M 
R V L A E P D T 
R V 

r y 



L A E P D T 
E P fi I 



I P 



P 0 H H D 

P 0 H M 0 

P & H A R 

P G H S L 

P G B T L 

P O B T L 

P G B A A 



S K N D_L K Y I V P y A R A R GVRVTPE IDMPGBAR 



Table III

Kinetic analysis of wild-type and recombinant S. pneumoniae β-N-acetylglucosaminidase
A desialylated, degalactosylated biantennary glycan was incubated with recombinant enzymes from clone pGXStrH8 and -17, the corresponding
factor Xa-cleaved enzyme, StrH8 and -17, or wild-type β-N-acetylglucosaminidase. Kinetic measurements were made as described in the text. The
length of the pGX constructs includes 232 amino acids from the 26-kDa glutathione S-transferase fusion protein tag.

Enzyme                              Length          Molecular mass(a)   Km           Vmax
                                    (amino acids)   (kDa)               (μM)         (pmol/min/mg)

Wild-type β-GlcNAc'ase              1311            120                 132 ± 11     17 ± 3.3
pGXStrH17 β-GlcNAc'ase              1067            121                 302 ± 130    24 ± 3.7
pGXStrH8 β-GlcNAc'ase               612             69                  364 ± 88     75 ± 6.9
StrH17 β-GlcNAc'ase (Xa-cleaved)    835             95                  278 ± 89     113 ± 13
StrH8 β-GlcNAc'ase (Xa-cleaved)     380             43                  723 ± 510    197 ± 73

(a) The molecular mass of each enzyme was determined by SDS-PAGE.

fied using Bio-Gel P-4 analysis, which gave actual sizes of the
reaction products in glucose units. The minimum enzyme
concentration (units/ml, using 4-MU-GlcNAc) for each enzyme
that was chosen showed no hydrolysis of (GlcNAcβ1-4GlcNAc)3
and thus retained exclusive β1-2 GlcNAc hydrolysis.
Recombinant pGXStrH8 enzyme, from the most truncated
clone, at 0.025 units/ml and wild-type β-N-acetylglucosaminidase
at 0.012 units/ml hydrolyzed the biantennary alditol to
completion (Fig. 5a). Only the β1-2-linked GlcNAc residues
were removed from a triantennary glycan (Fig. 5b). However,
only partial hydrolysis of a tetraantennary alditol was observed
using the recombinant pGXStrH8 enzyme compared with
wild-type enzyme (Fig. 5c), and no GlcNAcs were removed from
either bisected-biantennary or bisected-hybrid oligosaccharide
substrates (Figs. 5, d and e). Concentrations of 0.25 units/ml of
recombinant pGXStrH8 enzyme were required to hydrolyze
completely the susceptible β1-2-linked GlcNAc of tetraantennary
structures, but even at 45 units/ml, bisected oligosaccharides
remained refractory to digestion. By contrast, the recombinant
enzymes from the longest and intermediate clones,
pGXStrH17 and pGXStrH7, respectively, at low concentrations
(0.025 units/ml) showed the same specificity as the wild-type
enzyme, but only partially hydrolyzed those oligosaccharides
with a bisecting GlcNAc. At a concentration of 1.5 units/ml,
recombinant enzyme from pGXStrH17 exhibited the same
activity as the wild-type enzyme when assayed using both
bisected-biantennary and bisected-hybrid substrates. At higher
enzyme concentrations (5 units/ml), this recombinant enzyme
removed both terminal GlcNAc residues and the bisecting
GlcNAc from bisected-biantennary substrates, an activity also
achievable at high concentrations (0.1 units/ml) of the
wild-type enzyme. The recombinant enzyme from clone pGXStrH7
exhibited partial activity against the bisected substrates at the
highest enzyme concentration tested (17 units/ml). Thus the
recombinant enzyme with the greatest number of amino acids,
pGXStrH17 (Fig. 2, 1067 amino acids including the fusion tag,
see Table III), shared the same aglycon specificity as the
wild-type β-N-acetylglucosaminidase, whereas the smallest length
clone, pGXStrH8 (612 amino acids, Table III), was completely
unable to hydrolyze β1-2-linked GlcNAc residues from bisected
oligosaccharide alditols even at extremely high enzyme
concentrations (the maximum tested was 120 units/ml of factor
Xa-cleaved enzyme). The intermediate length clone pGXStrH7
(641 amino acids, see Fig. 2) was partially active in hydrolyzing
bisected substrates. A summary of these results is shown in
Fig. 6.

Recombinant enzymes that had the glutathione S-transferase
fusion tag removed by cleavage with factor Xa were checked
for activity against the five substrates used in Fig. 5 to confirm
that the fusion tag did not affect the activity of the recombinant 
fusion proteins assayed above. At the same enzyme concentra- 
tion, the factor Xa-cleaved recombinant enzymes showed iden- 
tical aglycon specificities to the uncleaved enzymes (results not 
shown), demonstrating that the fusion tag did not change the 
substrate specificity of the recombinant enzymes. 

DISCUSSION 

The peculiar aglycon specificity exhibited by β-N-acetylglucosaminidase
from S. pneumoniae not only makes it a useful
tool for oligosaccharide sequencing, it also makes it an interesting
candidate for the investigation of substrate-enzyme interactions.
At low enzyme concentrations, this enzyme will only
hydrolyze β1-2-linked GlcNAc residues and is restricted by
further N-acetylglucosamine substitutions of the α1-6
mannose arm of N-glycans and by bisecting GlcNAc residues of
the core mannose. Little is known about the factors that might
govern this restricted activity, though it might partly be explained
by the potential steric hindrance created by the flexibility
of 1-6-linked sugars, which have the additional ω angle
of rotation. However, because these restricted β1-2-linkages
are cleaved by higher concentrations of β-N-acetylglucosaminidase
as well as by other hexosaminidases, the intrinsic properties
of the enzyme itself, which correlate with this narrow
substrate specificity, remain to be clearly defined. In this paper
we have shown that S. pneumoniae β-N-acetylglucosaminidase
is encoded by a unique 3933-bp gene, strH, which terminates in
an ochre stop codon and possesses typical Gram-positive bacterial
transcription initiation and termination sequences. The
computer-translated amino acid sequence revealed two consensus
motifs, one at the amino terminus and one at the carboxyl terminus, that are

Fig. 5. Aglycon specificity of β-N-acetylglucosaminidase from S. pneumoniae. Both substrate and hydrolysis product elution positions
are shown for each oligosaccharide. N, GlcNAc; M, Man. The ³H-labeled oligosaccharide alditol substrates (2.5 × 10* cpm, ▲) were incubated with
12.5 milliunits/ml purified wild-type β-N-acetylglucosaminidase (○) and 25 milliunits/ml recombinant enzyme from clone pGXStrH8 (●) in 50 mM
citric acid/sodium phosphate buffer, pH 6.0, containing 1 mg/ml bovine serum albumin at 37 °C for 18 h. Samples were desalted, and the hydrolysis
products were separated by HPAEC using an eluant of 200 mM NaOH. Fractions (1 ml) were collected and scintillation counted for radioactivity.



Fig. 6. Oligosaccharide structures hydrolyzed by S. pneumoniae β-N-acetylglucosaminidase. A summary of the data presented
on the aglycon specificity of wild-type and recombinant β-N-acetylglucosaminidase. Solid arrows, complete hydrolysis at
appropriate enzyme concentration; dashed arrows, partial hydrolysis; crossed arrows, no hydrolysis at all enzyme concentrations
tested. See text and Fig. 5 for full details. N, GlcNAc; M, Man.

common to other Gram-positive surface proteins. A tandem
repeat was identified in the strH gene, and within each repeat
lies a stretch of 30 amino acids homologous to sequences in six
other hexosaminidases found in a wide variety of species, suggesting
that perhaps these amino acids may be important for
the catalytic function of the enzyme. The tandem repeat regions
of protein may be correlated with the ability of the clones
to hydrolyze substrate. Substrate specificity experiments re- 
vealed that the longest enzyme clone, pGXStrH17, was able to 
hydrolyze β1-2-linked GlcNAc residues of bisected oligosaccharides
similarly to the wild-type enzyme. The intermediate-sized
cloned enzyme from pGXStrH7 demonstrated only a partial
ability to hydrolyze these substrates, whereas the enzyme from
the shortest clone, pGXStrH8, was completely deficient in this
activity. At the protein level, the differences between these
clones are clearly defined by the presence or absence of the 
second tandem repeat region. Recombinant enzyme from pGXStrH17
contains both tandem repeats; pGXStrH7 contains all
of the first and 95 amino acids (of 330) of the second tandem
repeat region; and pGXStrH8 contains only the first tandem
repeat. Sequence alignment revealed that these tandem repeat
regions each contain a 30-amino acid consensus sequence pres- 
ent in six other hexosaminidases from divergent species, which 
leads us to speculate that this region may be part of an extended
site of the enzyme, which is required for substrate
orientation around the active site. In situations where this
portion is partially missing, as with the shortest clone, efficient
association with some substrates may be affected and hydrolysis
impaired. Further insights into the function of these
tandem repeat regions may be gained by the separate expres- 
sion of the second tandem repeat region followed by experi- 
ments addressing the substrate specificity. However, it is note- 
worthy that expression cloning failed to identify a clone that 
only contained the second repeat region and possibly indicates 
that the presence of this domain alone is not sufficient to 
maintain catalytic activity. The weaker affinity and increased 
rate of hydrolysis of biantennary oligosaccharides, together
with changes in aglycon specificity, correlate with an amino
acid-specific extension of the polypeptide carboxyl terminus.
This suggests that there is either a minimum amino acid residue
requirement for active site conformation, or a requirement
for protein domains outside the active site to recognize and
direct substrates into the enzyme's catalytic center.
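
The correlation drawn above between the extent of the second tandem repeat and activity toward bisected substrates can be tabulated in a short sketch. The Python below is illustrative only: the record type, its field names, and the helper functions are hypothetical, and the values are those quoted in the text. It assumes that "contains both tandem repeats" means all 330 residues of the second repeat and that "only the first tandem repeat" means none of them.

```python
from dataclasses import dataclass

# Length of the second tandem repeat region, as quoted in the text
# ("95 amino acids (of 330)").
SECOND_REPEAT_LENGTH = 330


@dataclass(frozen=True)
class CloneSummary:
    """Hypothetical record type; field names are illustrative only."""
    name: str
    amino_acids: int             # length quoted in the text
    second_repeat_residues: int  # residues of the second tandem repeat present
    bisected_hydrolysis: str     # "complete", "partial", or "none"


CLONES = [
    # Assumption: "contains both tandem repeats" is read as all 330 residues.
    CloneSummary("pGXStrH17", 1067, 330, "complete"),
    CloneSummary("pGXStrH7", 641, 95, "partial"),
    # Assumption: "only the first tandem repeat" is read as 0 residues.
    CloneSummary("pGXStrH8", 612, 0, "none"),
]


def second_repeat_fraction(clone: CloneSummary) -> float:
    """Fraction of the second tandem repeat region present in the clone."""
    return clone.second_repeat_residues / SECOND_REPEAT_LENGTH


def summarize(clones):
    """Return one formatted line per clone, most complete second repeat first."""
    ordered = sorted(clones, key=second_repeat_fraction, reverse=True)
    return [
        f"{c.name}: {c.amino_acids} aa, "
        f"{second_repeat_fraction(c):.0%} of second repeat, "
        f"bisected substrates: {c.bisected_hydrolysis}"
        for c in ordered
    ]


if __name__ == "__main__":
    for line in summarize(CLONES):
        print(line)
```

Under these assumptions, the ordering of the clones by the fraction of the second repeat they retain (all, about 29%, none) matches their ordering by activity toward bisected substrates (complete, partial, none). This is consistent with, but does not by itself establish, a role for that region in substrate recognition.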

A better understanding of the cellular location of the Streptococcal
β-N-acetylglucosaminidase can be derived from the
sequence information of the gene. The identification of a char- 
acteristic Streptococcal secretion signal at the amino terminus 
and a carboxyl-terminal sorting motif found in a variety of 
Gram-positive cell-surface proteins predicts a membrane- 
bound enzyme with an active site extending into the extracel- 
lular space. This structural model explains the origin of the 
β-N-acetylglucosaminidase purified from the medium. Several
lines of evidence support the hypothesis that this form of the 
enzyme originates from the cell surface and is released during 
protease-assisted autolysis. First, the cell-associated enzyme
was found to be present in much greater quantities than the 
enzyme found in the medium over an 8-h growth period (3). 
Purification of the wild-type enzyme from Streptococcal cells 
revealed multiple amino termini, and the presence of a number 
of isoforms when examined by native-polyacrylamide gel elec- 
trophoresis (3). The major protein and activity stained band of 
this preparation analyzed by SDS-polyacrylamide gel electro- 
phoresis migrates with a molecular mass of 120 kDa, 24 kDa 
smaller than the 144-kDa translated gene, providing further 
evidence that proteolysis, additional to signal peptide cleavage, 
occurs during the release of the cell-associated enzyme. 

Second, only a single gene was obtained from expression 
cloning. If more than one β-N-acetylglucosaminidase was expressed
by this organism, then more than one gene might have
been isolated by enzyme activity. Although 32 clones were 
selected by their β-N-acetylglucosaminidase activity, further
analysis of five of these selected at random revealed that they 
were overlapping fragments of the same gene. 

This evidence, together with the substrate specificity data 
(3), strongly implies that both forms of the
β-N-acetylglucosaminidase are derived from the same protein located
at the cell surface.

The contribution to substrate configuration made by amino 
acid residues outside the active site in glycohydrolytic enzymes 
is a largely unexplored phenomenon. The availability of truncated
enzymes that have altered substrate specificities as described
here provides a unique opportunity to study the mechanism
by which glycosidases bind and hydrolyze complex
oligosaccharides.